Practical compressed string dictionaries
Similar resources
Practical compressed string dictionaries
The need to store and query a set of strings – a string dictionary – arises in many kinds of applications. While classically these string dictionaries have accounted for a small share of the total space budget (e.g., in Natural Language Processing or when indexing text collections), recent applications in Web engines, Semantic Web (RDF) graphs, Bioinformatics, and many others, handle very large...
Compressed String Dictionaries
The problem of storing a set of strings – a string dictionary – in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing or for indexing text collections), recent applications in Web engines, RDF graphs, Bioinformatics, and many others, handle very large string dictionaries, whose ...
Compressed Matching in Dictionaries
The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding and a new variant based on Fibonacci codes is presented. Experimental results suggest that the new methods are o...
Searching in Compressed Dictionaries
The problem of Compressed Pattern Matching, introduced by Amir and Benson [1], is that of performing pattern matching directly in a compressed text without any decompressing. More formally, for a given text T, pattern P and complementary encoding and decoding functions E and D, respectively, our aim is to search for E(P) in E(T), rather than the usual approach which searches for the pattern P in...
Bootstrapping pronunciation dictionaries: practical issues
Bootstrapping techniques are an efficient way to develop electronic pronunciation dictionaries [1, 2], but require fast system response to be practical for medium-to-large lexicons. In addition, user errors are inevitable during this process, and it is useful if automatic means can be used to assist in the search for such errors. We describe how the Default&Refine grapheme-to-phoneme rule extrac...
Journal
Journal title: Information Systems
Year: 2016
ISSN: 0306-4379
DOI: 10.1016/j.is.2015.08.008